101 research outputs found

    Convolutional Neural Networks for Water Segmentation using Sentinel-2 Red, Green, Blue (RGB) Composites and Derived Spectral Indices

    Near-real-time water segmentation with medium-resolution satellite imagery plays a critical role in water management. Automated water segmentation of satellite imagery has traditionally been achieved using spectral indices. Spectral water segmentation is limited by environmental factors and requires human expertise to be applied effectively. In recent years, the use of convolutional neural networks (CNNs) for water segmentation has been successful on high-resolution satellite imagery, but to a lesser extent on medium-resolution imagery. Existing studies have been limited to geographically localized datasets, and reported metrics have been benchmarked against a limited range of spectral indices. This study seeks to determine whether a single CNN based on Red, Green, Blue (RGB) image classification can effectively segment water on a global scale and outperform traditional spectral methods. Additionally, this study evaluates the extent to which smaller datasets (with very complex patterns, e.g. harbour megacities) can be used to improve globally applicable CNNs within a specific region. Multispectral imagery from the European Space Agency's Sentinel-2 satellite (10 m spatial resolution) was sourced. Test sites were selected in Florida, New York, and Shanghai to represent a globally diverse range of waterbody typologies. Region-specific spectral water segmentation algorithms were developed for each test site to represent benchmarks of spectral index performance. DeepLabV3-ResNet101 was trained on 33,311 semantically labelled true-colour samples. The resulting model was retrained on three smaller subsets of the data, specific to New York, Shanghai, and Florida. CNN predictions reached a maximum mean intersection over union of 0.986 and an F1-score of 0.983. At the Shanghai test site, the CNN's predictions outperformed the spectral benchmark, primarily due to the CNN's ability to process contextual features at multiple scales. In all test cases, retraining the network on a localized subset of the dataset improved that region's segmentation predictions. The CNNs presented are suitable for cloud-based deployment and could contribute to the wider use of satellite imagery for water management.
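    The spectral benchmarks referenced above are typically indices such as the Normalized Difference Water Index (NDWI). A minimal sketch of index-based water masking is below; the threshold and the toy reflectance values are illustrative assumptions, not the study's actual benchmark algorithm.

```python
import numpy as np

def ndwi(green, nir):
    """NDWI (McFeeters): (Green - NIR) / (Green + NIR)."""
    green = green.astype(float)
    nir = nir.astype(float)
    return (green - nir) / (green + nir + 1e-9)  # epsilon avoids div-by-zero

def water_mask(green, nir, threshold=0.0):
    """Pixels with NDWI above the threshold are classified as water."""
    return ndwi(green, nir) > threshold

# Toy 2x2 scene: water reflects more green than near-infrared.
green = np.array([[0.30, 0.05], [0.28, 0.06]])
nir   = np.array([[0.05, 0.40], [0.04, 0.35]])
mask = water_mask(green, nir)
```

    Unlike a CNN, such a per-pixel rule uses no spatial context, which is why contextual features at multiple scales gave the CNN an edge at the Shanghai site.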

    How Am I Doing?: Evaluating Conversational Search Systems Offline

    As conversational agents like Siri and Alexa gain in popularity and use, conversation is becoming an increasingly important mode of interaction for search. Conversational search shares some features with traditional search, but differs in some important respects: conversational search systems are less likely to return ranked lists of results (a SERP), more likely to involve iterated interactions, and more likely to feature longer, well-formed user queries in the form of natural language questions. Because of these differences, traditional methods for search evaluation (such as the Cranfield paradigm) do not translate easily to conversational search. In this work, we propose a framework for offline evaluation of conversational search, which includes a methodology for creating test collections with relevance judgments, an evaluation measure based on a user interaction model, and an approach to collecting user interaction data to train the model. The framework is based on the idea of "subtopics", often used to model novelty and diversity in search and recommendation, and the user model is similar to the geometric browsing model introduced by RBP and used in ERR. As far as we know, this is the first work to combine these ideas into a comprehensive framework for offline evaluation of conversational search.
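    The geometric browsing model mentioned above can be sketched in a few lines: the user continues to the next turn with fixed probability p, so turn k is examined with probability p^(k-1), as in RBP. The gain values and p below are illustrative, not the paper's calibrated parameters.

```python
def expected_gain(gains, p=0.8):
    """Expected utility under a geometric browsing model (RBP-style):
    turn k is examined with probability p**(k-1), so the expected gain
    is (1 - p) * sum_k p**(k-1) * g_k."""
    return (1 - p) * sum(p**k * g for k, g in enumerate(gains))

# A conversation that resolves a subtopic early scores higher than one
# that resolves the same subtopic late.
early = expected_gain([1, 0, 0], p=0.5)
late = expected_gain([0, 0, 1], p=0.5)
```

    This captures the intuition that, without a SERP, the cost of a conversational system burying useful content in later turns should be reflected in the measure.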

    Learning Neural Point Processes with Latent Graphs

    Neural point processes (NPPs) employ neural networks to capture complicated dynamics of asynchronous event sequences. Existing NPPs feed all history events into neural networks, assuming that all event types contribute to the prediction of the target type. However, this assumption can be problematic because in reality some event types do not contribute to the predictions of another type. To correct this defect, we learn to omit those types of events that do not contribute to the prediction of one target type during the formulation of NPPs. Towards this end, we simultaneously consider the tasks of (1) finding event types that contribute to predictions of the target types and (2) learning an NPP model from event sequences. For the former, we formulate a latent graph, with event types being vertices and non-zero contributing relationships being directed edges; then we propose a probabilistic graph generator, from which we sample a latent graph. For the latter, the sampled graph can be readily used as a plug-in to modify an existing NPP model. Because these two tasks are nested, we propose to optimize the model parameters through bilevel programming, and develop an efficient solution based on truncated gradient back-propagation. Experimental results on both synthetic and real-world datasets show improved performance against state-of-the-art baselines. This work removes the disturbance of non-contributing event types with the aid of a validation procedure, similar to the practice used to mitigate overfitting when training machine learning models.
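    The latent-graph idea can be illustrated with a toy sketch: sample a binary adjacency matrix over event types, then mask out history events whose type has no edge to the target type. The edge probabilities below are contrived for illustration; the paper's generator is learned jointly with the NPP.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent_graph(edge_probs):
    """Sample a binary adjacency matrix A, where A[i, j] = 1 means events
    of type i are allowed to influence predictions of type j."""
    return (rng.random(edge_probs.shape) < edge_probs).astype(int)

def mask_history(history_types, adjacency, target_type):
    """Keep only history events whose type has an edge to the target type."""
    return [t for t in history_types if adjacency[t, target_type] == 1]

# Three event types; by construction, type 2 never influences type 0.
probs = np.array([[1.0, 1.0, 1.0],
                  [1.0, 1.0, 1.0],
                  [0.0, 1.0, 1.0]])
A = sample_latent_graph(probs)
kept = mask_history([0, 2, 1, 2], A, target_type=0)
```

    In the actual model the sampling is made differentiable (hence the bilevel optimization with truncated back-propagation), but the plug-in masking role of the sampled graph is the same.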

    Subsequence Based Deep Active Learning for Named Entity Recognition

    Active Learning (AL) has been successfully applied to Deep Learning in order to drastically reduce the amount of data required to achieve high performance. Previous works have shown that lightweight architectures for Named Entity Recognition (NER) can achieve optimal performance with only 25% of the original training data. However, these methods do not exploit the sequential nature of language and the heterogeneity of uncertainty within each instance, requiring the labelling of whole sentences. Additionally, this standard method requires that the annotator has access to the full sentence when labelling. In this work, we overcome these limitations by allowing the AL algorithm to query subsequences within sentences, and propagate their labels to other sentences. We achieve highly efficient results on OntoNotes 5.0, only requiring 13% of the original training data, and CoNLL 2003, requiring only 27%. This is an improvement of 39% and 37% compared to querying full sentences.
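    The core of subsequence querying can be sketched as selecting the contiguous span with the highest summed token-level uncertainty, so only that span is sent to the annotator. The fixed span length and toy scores are illustrative assumptions; the paper's acquisition function may differ.

```python
def most_uncertain_subsequence(token_uncertainties, length):
    """Return the (start, end) indices of the contiguous subsequence of the
    given length with the highest summed model uncertainty -- the span the
    annotator would be asked to label instead of the whole sentence."""
    best_start = max(range(len(token_uncertainties) - length + 1),
                     key=lambda s: sum(token_uncertainties[s:s + length]))
    return best_start, best_start + length

# Uncertainty peaks around a possible entity mention.
scores = [0.1, 0.1, 0.9, 0.8, 0.2, 0.1]
span = most_uncertain_subsequence(scores, length=2)
```

    This exploits the heterogeneity of uncertainty within a sentence: confident tokens outside the span never cost annotation effort.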

    StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts

    Inferring spatial relations in natural language is a crucial ability an intelligent system should possess. The bAbI dataset tries to capture tasks relevant to this domain (tasks 17 and 19). However, these tasks have several limitations. Most importantly, they are limited to fixed expressions, they are limited in the number of reasoning steps required to solve them, and they fail to test the robustness of models to input that contains irrelevant or redundant information. In this paper, we present a new Question-Answering dataset called StepGame for robust multi-hop spatial reasoning in texts. Our experiments demonstrate that state-of-the-art models on the bAbI dataset struggle on the StepGame dataset. Moreover, we propose a Tensor-Product based Memory-Augmented Neural Network (TP-MANN) specialized for spatial reasoning tasks. Experimental results on both datasets show that our model outperforms all the baselines with superior generalization and robustness performance.
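    The kind of multi-hop inference StepGame tests can be illustrated by composing relative spatial offsets along a chain of statements; this is a toy illustration of the reasoning, not the dataset's actual format or vocabulary.

```python
# Multi-hop spatial reasoning reduces to composing per-step offsets:
# "A is left of B. B is above C." -> where is A relative to C?
OFFSETS = {"left": (-1, 0), "right": (1, 0), "above": (0, 1), "below": (0, -1)}

def compose(relations):
    """Sum the (dx, dy) offsets along a chain of spatial relations."""
    x = sum(OFFSETS[r][0] for r in relations)
    y = sum(OFFSETS[r][1] for r in relations)
    return x, y

# Two hops: A is left of B, and B is above C, so A is upper-left of C.
dx, dy = compose(["left", "above"])
```

    StepGame additionally injects irrelevant or redundant statements, so a robust model must identify which relations belong on the chain before composing them.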

    Forecasting Solar Home System Customers’ Electricity Usage with a 3D Convolutional Neural Network to Improve Energy Access

    Off-grid technologies, such as solar home systems (SHS), offer the opportunity to alleviate global energy poverty, providing a cost-effective alternative to an electricity grid connection. However, there is a paucity of high-quality SHS electricity usage data and thus a limited understanding of consumers' past and future usage patterns. This study addresses this gap by providing a rare large-scale analysis of real-time energy consumption data for SHS customers (n = 63,299) in Rwanda. Our results show that 70% of SHS users' electricity usage decreased a year after their SHS was installed. This paper is novel in its application of a three-dimensional convolutional neural network (CNN) architecture for electricity load forecasting using time series data. It also marks the first time a CNN was used to predict SHS customers' electricity consumption. The model forecasts individual households' usage 24 h and seven days ahead, as well as an average week across the next three months. The last scenario achieved the best performance, with a mean squared error of 0.369. SHS companies could use these predictions to offer a tailored service to customers, including providing feedback information on their likely future usage and expenditure. The CNN could also aid load balancing for SHS-based microgrids.
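    One plausible reason a 3-D convolution suits this data is that hourly usage can be arranged as a (weeks, days, hours) tensor, letting filters see hour-of-day, day-of-week, and week-on-week structure at once. The layout below is an assumption for illustration, not the paper's exact architecture, and the "average week" here is a naive mean rather than the CNN forecast.

```python
import numpy as np

# Hypothetical layout: 12 weeks of hourly usage reshaped into a 3-D tensor.
hours_per_day, days_per_week, n_weeks = 24, 7, 12
usage = np.arange(n_weeks * days_per_week * hours_per_day, dtype=float)
tensor = usage.reshape(n_weeks, days_per_week, hours_per_day)

# A naive "average week" baseline: mean over the week axis, giving one
# (day, hour) profile comparable to the paper's third forecasting scenario.
average_week = tensor.mean(axis=0)
```

    A Conv3d layer applied to such a tensor would slide filters across all three axes simultaneously, which a 1-D convolution over the flat hourly series cannot do.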

    Mitigating the Position Bias of Transformer Models in Passage Re-Ranking

    Supervised machine learning models and their evaluation strongly depend on the quality of the underlying dataset. When we search for a relevant piece of information, it may appear anywhere in a given passage. However, we observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-ranking. The excessive favoring of earlier positions inside passages is an unwanted artefact. This leads three common Transformer-based re-ranking models to ignore relevant parts in unseen passages. More concerningly, as the evaluation set is taken from the same biased distribution, models overfitting to that bias overestimate their true effectiveness. In this work we analyze position bias on datasets, the contextualized representations, and their effect on retrieval results. We propose a debiasing method for retrieval datasets. Our results show that a model trained on a position-biased dataset exhibits a significant decrease in re-ranking effectiveness when evaluated on a debiased dataset. We demonstrate that by mitigating the position bias, Transformer-based re-ranking models are equally effective on biased and debiased datasets, as well as more effective in a transfer-learning setting between two differently biased datasets.
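    One simple way to construct a debiased evaluation set is to break the correlation between relevance and position, e.g. by permuting sentence order within each passage; this is an illustrative strategy, not necessarily the paper's exact debiasing method.

```python
import random

def debias_position(passages, seed=0):
    """Shuffle sentence order within each passage so the relevant sentence
    is no longer concentrated at the start (illustrative debiasing sketch)."""
    rng = random.Random(seed)
    shuffled = []
    for sentences in passages:
        s = list(sentences)  # copy so the original passage is untouched
        rng.shuffle(s)
        shuffled.append(s)
    return shuffled

# 100 toy passages where the answer always comes first (position bias).
biased = [["ANSWER", "filler1", "filler2", "filler3"]] * 100
debiased = debias_position(biased)
positions = [p.index("ANSWER") for p in debiased]
```

    After shuffling, the answer's position is spread across the passage, so a model can no longer score well by attending only to early tokens.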

    Inference of virtual network functions' state via analysis of the CPU behavior

    The ongoing process of softwarization of IT networks promises to reduce the operational and management costs of network infrastructures by replacing hardware middleboxes with equivalent pieces of code executed on general-purpose servers. Alongside the benefits from the operator's perspective, new strategies to provide the network's resources to users are arising. Following the principle of "everything as a service", multiple tenants can access the required resources – typically CPUs, NICs, or RAM – according to a Service-Level Agreement. However, tenants' applications may require a complex and expensive measurement infrastructure to continuously monitor the network function's state. Although the application's specific behavior is unknown (and often opaque to the infrastructure owner), the software nature of (virtual) network functions (VNFs) may be the key to inferring the behavior of the high-level functions by accessing low-level information, which is still under the control of the operating system and therefore of the infrastructure owner. As such, in the scenario of software VNFs executed on COTS servers, the underlying CPU's behavior can be used as the sole predictor for the high-level VNF state without explicit in-network measurements: in this paper, we develop a novel methodology to infer high-level characteristics such as throughput or packet loss using CPU data instead of network measurements. Our methodology consists of (i) experimentally analyzing the behavior of a CPU that executes a VNF under different loads, (ii) extracting a correlation between the CPU footprint and the high-level application state, and (iii) using this knowledge to detect the previously mentioned network metrics. Our code and datasets are publicly available.
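    Steps (ii) and (iii) above amount to fitting a model from CPU-level features to an application-level metric and then predicting that metric without network probes. The sketch below uses a plain least-squares fit on synthetic data; the feature names and the linear relationship are illustrative assumptions, not the paper's measured correlation.

```python
import numpy as np

# Synthetic CPU footprint: two hypothetical features per sample,
# e.g. [utilisation, cache-miss rate], on a 0-1 scale.
rng = np.random.default_rng(1)
cpu_features = rng.uniform(0, 1, size=(200, 2))

# Assumed ground-truth relation between CPU footprint and VNF throughput.
throughput = 5.0 * cpu_features[:, 0] - 2.0 * cpu_features[:, 1]

# Step (ii): fit a linear model (with intercept) from CPU data.
X = np.column_stack([cpu_features, np.ones(len(cpu_features))])
coef, *_ = np.linalg.lstsq(X, throughput, rcond=None)

# Step (iii): predict throughput from CPU data alone.
predicted = X @ coef
```

    In practice the mapping is unlikely to be this clean, but the structure – learn offline under controlled loads, then predict online from OS-visible counters – matches the methodology described.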

    Evaluation metrics for measuring bias in search engine results

    Search engines decide what we see for a given search query. Since many people are exposed to information through search engines, it is fair to expect that search engines are neutral. However, search engine results do not necessarily cover all the viewpoints of a search query topic, and they can be biased towards a specific view, since search engine results are returned based on relevance, which is calculated using many features and sophisticated algorithms where search neutrality is not necessarily the focal point. Therefore, it is important to evaluate search engine results with respect to bias. In this work we propose novel web search bias evaluation measures which take into account rank and relevance. We also propose a framework to evaluate web search bias using the proposed measures and test our framework on two popular search engines based on 57 controversial query topics such as abortion, medical marijuana, and gay marriage. We measure the stance bias (in support or against), as well as the ideological bias (conservative or liberal). We observe that the stance does not necessarily correlate with the ideological leaning; e.g. a positive stance on abortion indicates a liberal leaning, but a positive stance on the Cuba embargo indicates a conservative leaning. Our experiments show that neither of the search engines suffers from stance bias. However, both search engines suffer from ideological bias, each favouring one ideological leaning over the other, which is more significant from the perspective of polarisation in our society.
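    A rank-aware bias measure can be sketched by averaging per-result stance scores with a discount that gives top-ranked results more weight. The log2 discount below is an illustrative choice, not necessarily the paper's exact weighting.

```python
import math

def rank_weighted_bias(stances):
    """Average stance scores (-1 against, 0 neutral, +1 in support) with
    rank-discounted weights, so biased results near the top of the ranking
    count more. Returns 0 for a perfectly neutral ranking."""
    weights = [1 / math.log2(rank + 2) for rank in range(len(stances))]
    return sum(w * s for w, s in zip(weights, stances)) / sum(weights)

# Same stance mix, different orderings: stacking the supporting results
# at the top yields a larger measured bias.
balanced = rank_weighted_bias([1, -1, 1, -1])
top_heavy = rank_weighted_bias([1, 1, -1, -1])
```

    Because rank matters, two result lists with identical stance distributions can still differ in measured bias, which is exactly what a rank-and-relevance-aware measure is meant to capture.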

    Self-Attentive Hawkes Process

    Capturing the occurrence dynamics is crucial to predicting which type of events will happen next and when. A common method to do this is through Hawkes processes. To enhance their capacity, recurrent neural networks (RNNs) have been incorporated due to RNNs' successes in processing sequential data such as languages. Recent evidence suggests that self-attention is more competent than RNNs in dealing with languages. However, we are unaware of the effectiveness of self-attention in the context of Hawkes processes. This study aims to fill the gap by designing a Self-Attentive Hawkes Process (SAHP). SAHP employs self-attention to summarise the influence of history events and compute the probability of the next event. One deficit of conventional self-attention, when applied to event sequences, is that its positional encoding only considers the order of a sequence, ignoring the time intervals between events. To overcome this deficit, we modify its encoding by translating time intervals into phase shifts of sinusoidal functions. Experiments on goodness-of-fit and prediction tasks show the improved capability of SAHP. Furthermore, SAHP is more interpretable than RNN-based counterparts because the learnt attention weights reveal contributions of one event type to the happening of another type. To the best of our knowledge, this is the first work that studies the effectiveness of self-attention in Hawkes processes.
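    The modified encoding can be sketched as a standard sinusoidal positional encoding with the inter-event time interval added as a phase shift, so two events at the same sequence position but with different time gaps get different encodings. The scale constant below is an illustrative assumption, not the paper's exact parameterisation.

```python
import math

def time_shifted_encoding(position, interval, dim, scale=10.0):
    """Sinusoidal positional encoding where the time interval since the
    previous event is translated into a phase shift of each sinusoid."""
    enc = []
    for i in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * i / dim))
        phase = interval * scale * freq  # time interval -> phase shift
        enc.append(math.sin(position * freq + phase))
        enc.append(math.cos(position * freq + phase))
    return enc

# Same position in the sequence, different inter-event gaps:
a = time_shifted_encoding(position=3, interval=0.5, dim=8)
b = time_shifted_encoding(position=3, interval=2.0, dim=8)
```

    An order-only encoding would map both events to the same vector, losing exactly the timing information a Hawkes process depends on.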